I.  INTRODUCTION


Evaluation techniques were developed to assess computer simulation of
chemical simulant munition field trials. These techniques were used to
quantitatively compare the predicted to the experimental deposition data for
single munitions. The methods employed for evaluating single munitions are not
appropriate for the multiple munition field trials currently being conducted.
Hence, the data fitting techniques were re-examined and a new methodology was
developed that could be applied regardless of the number of munitions tested.
A brief background discussion is provided to aid in understanding the intended
purpose of this analysis.


II.  BACKGROUND 
1.  Chemical  Munition  Analyses 

a.  Single Munition.  In previous analyses, a single chemical munition was
under investigation. A computer code called BIND was used to compare the
computer-generated ground deposition to the experimental deposition data. Data
analysis techniques in BIND utilized metrics which considered both the area and
the geometry of the two-dimensional pattern on the ground at user-specified
deposition levels. These deposition patterns are referred to as contours
(iso-deposition curves). The geometry of the pattern was described in terms of
the major axis, which was defined as the length of the pattern measured in the
downwind direction. BIND also provided graphical output of the predicted and
experimental deposition contours. Visual inspection of these contours at the
user-specified deposition levels aided in assessing the goodness of the
computer prediction.

b.  Multiple Munitions.  In analyzing multiple munition field trials, it was
determined that the metrics used for the single munition analysis were not
appropriate. For contours generated from multiple munitions, the major axis is
not clearly defined because the shape is subject to the relative placement of
one munition to another. Overlaying of contours from multiple munitions must be
considered when describing the shape of the entire pattern; the major axis does
not describe this phenomenon. Therefore, other means of evaluating the computer
prediction were investigated through the use of additional quantitative tests.
These tests were implemented in an interactive graphic computer program called
MULTI.


1.  Klopcic, J. Terrence and Hindman, Tracy P., Deposition Modeling and
Effectiveness Evaluation Using Chemical Weapon Field Trial Dispersion Data,
U.S. Army Ballistic Research Laboratory, Technical Report BRL-MR-3652, March
1988 (Unclassified).
2.  Experimental and Predicted Data

To evaluate the fitting procedures, experimental and computer-generated data
for eleven field trials were examined. Experimental data collected from field
trials conducted at Dugway Proving Ground (DPG) included dissemination data,
optical and radar tracking data of the munition from time of function to
impact, and meteorological data. Dissemination data can be collected in various
ways. For this analysis, data collection consisted of a 50 by 50 array of
filter paper samplers placed 100 meters apart on a 3 mile by 3 mile grid, as
shown in Figure A-1 of Appendix A. The meteorological data included
temperature, Pasquill stability category, and wind speed and direction as
measured from two tower locations. A thirty-minute summary of the
meteorological data, along with supplemental data, was reported.


The Non Uniform Simple Surface Evaporation (NUSSE3) model was used to
replicate the liquid (simulant) ground deposition patterns by describing the
transport and diffusion of chemical agent(s) into the lower atmosphere. A
subset of DPG data, for example the meteorological and trajectory data, was
used to obtain the necessary inputs for NUSSE3; the remaining inputs were
derived mathematically from the initial conditions of the test. The initial set
of NUSSE3 inputs was referred to as the baseline. For example, the
thirty-minute summary of the meteorological data provided the baseline
meteorological conditions. Once the baseline was determined, the process of
generating a chemical pattern representative of the field trial was begun.


Visual comparison of the computer-generated plots of the baseline and the
experimental contours proved to be the most useful tool in determining which
NUSSE3 inputs should be varied to better represent the experimental data. One
set of NUSSE3 inputs typically varied was that associated with meteorological
conditions. For example, if the predicted pattern was not oriented correctly,
then the wind direction was varied based on values recorded by DPG. Data
reported at several time intervals provided a range in which these parameters
could vary. Each variation in parameters was evaluated in relation to the DPG
deposition data through the use of MULTI; these variations were referred to as
excursions. The results of each excursion were evaluated using visual
inspection and quantitative information. This process was continued until the
best predicted-to-experimental comparison (best-fit) was achieved.
2.  Saucier, R., NUSSE3 Model Description, U.S. Army Chemical Research,
Development and Engineering Center, Technical Report CRDEC-TR-87046, May 1987
(Unclassified).

3.  Saucier, R., NUSSE3 User's Guide and Reference Manual, U.S. Army Chemical
Research and Development Center, Special Publication CRDC-SP-86009, March 1986
(Unclassified).
The focus of this analysis was to evaluate new quantitative methods applicable
to both single and multiple munition field trials. A best-fit criterion was
developed based on the results of this analysis.


III.  QUANTITATIVE  TESTS 

Several statistical tests were considered for evaluating chemical munitions.
The hypothesis tests chosen for this analysis were the Sign, Mann-Whitney U,
and Chi-Square tests. The methods used for these tests are those described by
Johnson. In addition, a modified version of the Chi-Square was developed; this
metric is referred to as the Point-by-Point. Each test was exercised to
determine its appropriateness for this application.

1.  Sign  Test 

The nonparametric Sign test is used as a hypothesis test of the median
difference of two populations. For this analysis, the null hypothesis (H0) was
that the NUSSE3 model provides a good prediction of the DPG deposition data; in
other words, the median difference between the experimental and predicted data
was zero. The Sign test was applied to the predicted and experimental
deposition data at every grid location for a user-specified deposition level.
For this analysis, deposition levels of 1 mg/m^2 and greater were examined. The
difference between the predicted (NUSSE3) and the experimental (DPG) data was
converted to a + or - sign, disregarding the datum if the difference was zero.
The number of positive differences and the number of negative differences were
summed. The smaller of these sums (x) was selected and used to derive a
critical value (z). This value was used to determine whether there was
sufficient evidence to reject H0 at a certain significance level. The
significance level chosen for this analysis was 0.05, that is, the probability
of rejecting H0 when it is actually true.

The determination of the z value is based upon sample size. For sample size n,
where 6 < n < 100, z is taken from a table of critical values for the Sign
test. If z is less than x, there is not sufficient evidence to reject H0;
otherwise H0 is rejected at the 0.05 significance level.
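The tabulated critical values for the Sign test come from the binomial distribution with success probability 0.5. The following is a minimal sketch of how such a value could be computed rather than looked up; the function name and the two-tailed convention are assumptions made for illustration, and a published table may differ by one at some sample sizes depending on the convention it uses.

```python
from math import comb


def sign_test_critical_value(n, alpha=0.05):
    """Largest k with 2 * P(X <= k) <= alpha for X ~ Binomial(n, 0.5).

    Hypothetical helper (not part of MULTI). Returns -1 when no count,
    not even zero, is significant at this sample size.
    """
    total = 2 ** n          # number of equally likely sign patterns
    cumulative = 0          # running count of patterns with X <= k
    critical = -1
    for k in range(n + 1):
        cumulative += comb(n, k)
        if 2 * cumulative <= alpha * total:
            critical = k
        else:
            break
    return critical


def reject_h0(x, n, alpha=0.05):
    """Reject H0 when the smaller sign count x is at or below the critical value."""
    return x <= sign_test_critical_value(n, alpha)
```

For example, with n = 20 nonzero differences the sketch gives a critical value of 5, so a smaller sign count of 6 would not be sufficient evidence to reject H0.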


4.  Johnson, Robert R., Elementary Statistics, Duxbury Press, 1976.
If n is larger than 100, the Sign test is carried out by means of a normal
approximation using a standard normal variable z, where z and x' are defined as
follows:

$$ z = \frac{x' - \frac{n}{2}}{\frac{1}{2}\sqrt{n}} \tag{1} $$

$$ x' = x + 0.5 \tag{2} $$

The z value was compared to the range from -t to +t, where t was taken from a t
distribution table (two-tailed test) at significance level 0.05. If z was in
the acceptance region, -t to +t, there was not sufficient evidence to reject
the H0. If z was in the critical range, this indicated a rejection of H0 at the
0.05 significance level.
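The large-sample procedure can be sketched in a few lines. This is an illustration only, assuming the continuity-corrected form z = (x' - n/2) / (0.5 * sqrt(n)) with x' = x + 0.5 and assuming the two data sets are supplied as equal-length sequences paired by grid location; the function name is hypothetical and is not taken from MULTI.

```python
from math import sqrt


def sign_test_z(predicted, experimental):
    """Return (z, n) for the Sign test normal approximation.

    predicted and experimental are paired by grid location. Zero
    differences are discarded, as described in the text.
    """
    plus = sum(1 for p, e in zip(predicted, experimental) if p > e)
    minus = sum(1 for p, e in zip(predicted, experimental) if p < e)
    n = plus + minus                      # sample size after dropping ties
    if n == 0:
        raise ValueError("all differences are zero")
    x = min(plus, minus)                  # the smaller of the two sign counts
    x_prime = x + 0.5                     # continuity correction
    z = (x_prime - n / 2) / (0.5 * sqrt(n))
    return z, n
```

With 60 positive and 40 negative differences, x = 40 and z = (40.5 - 50) / 5 = -1.9, which lies inside an acceptance region of -1.96 to +1.96.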

2.  Mann-Whitney  U  Test 

The Mann-Whitney U test is a nonparametric hypothesis test for the difference
between two independent means. The H0 stated that the mean difference between
the experimental and predicted data equals zero. To apply the Mann-Whitney in
this analysis, both data sets (predicted and experimental) were combined,
placed in ascending order, and a rank number assigned to each datum. If a tie
existed, the average rank was assigned to all data involved. The sum of the
ranks for the predicted data (S_p) and the sum of the ranks for the
experimental data (S_e) were calculated. A U value was calculated with the
following formulas:

$$ U_e = (n_p)(n_e) + \frac{(n_e)(n_e + 1)}{2} - S_e \tag{3} $$

$$ U_p = (n_p)(n_e) + \frac{(n_p)(n_p + 1)}{2} - S_p \tag{4} $$

where n_e and n_p were the sample sizes for the experimental and the predicted,
respectively. If both sample sizes are less than or equal to 10, the smaller of
the U statistics is compared to a critical value (z) taken from the
Mann-Whitney table of critical values. The values n_e and n_p must be greater
than 1 and the combined sum of n_e and n_p equal to ten in order to use this
test. If U is larger than z, there is not sufficient evidence to reject H0;
otherwise, H0 is rejected at the 0.05 significance level.

If either sample size was larger than twenty or both were larger than ten, the
critical value was found through the standard normal variable z:

$$ z = \frac{U - \mu_U}{\sigma_U} \tag{5} $$

where U was the smaller of U_e and U_p, \mu_U was the mean of U, and \sigma_U
was the standard deviation of U, both determined by the sample sizes. The
critical z was compared to the range from -t to +t (two-tailed test) at
significance level 0.05 from the t distribution table. If z was in the
acceptance region, there was not sufficient evidence to reject the H0; z in the
critical range rejects H0 at the 0.05 significance level.
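The ranking and the normal approximation can be sketched as follows. This is a hedged illustration, not the MULTI implementation: the function names are hypothetical, the labeling of the two U statistics is an assumption, and the mean and standard deviation use the standard textbook expressions n_p * n_e / 2 and sqrt(n_p * n_e * (n_p + n_e + 1) / 12), which the report does not spell out.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1   # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_z(predicted, experimental):
    """Return (z, u_e, u_p) using the large-sample normal approximation."""
    n_p, n_e = len(predicted), len(experimental)
    ranks = average_ranks(list(predicted) + list(experimental))
    s_p = sum(ranks[:n_p])    # rank sum of the predicted data
    s_e = sum(ranks[n_p:])    # rank sum of the experimental data
    u_e = n_p * n_e + n_e * (n_e + 1) / 2 - s_e
    u_p = n_p * n_e + n_p * (n_p + 1) / 2 - s_p
    u = min(u_e, u_p)
    mu = n_p * n_e / 2
    sigma = sqrt(n_p * n_e * (n_p + n_e + 1) / 12)
    return (u - mu) / sigma, u_e, u_p
```

Whatever the labeling, the two U statistics always sum to n_p * n_e, so a small value of one implies a large value of the other. The tiny samples used to check the arithmetic are far below the sizes for which the approximation is intended.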

3.  Chi-Square  Test 

The Chi-Square test is a hypothesis test with several applications. For this
analysis, the test was used to compare the frequency of occurrences at various
deposition levels. The H0 was that the NUSSE3 model provides a good prediction
of the DPG deposition data. The predicted and the experimental data were
binned; the initial size for each bin was 10 units (mg/m^2). A minimum of two
bins is required to perform the Chi-Square test. If there were fewer than 5
occurrences in a bin for either the predicted or the experimental data, then
the bin was expanded; these final bin ranges varied from one excursion to
another. However, the ranges for the predicted and experimental bins were
forced to be equal to allow for comparison (i.e., if bin 1 for the predicted
data was 1 to 10 mg/m^2, the same was true for the experimental). Once the
occurrences in each bin were counted, this formula was applied:

$$ \chi^2 = \sum_{k=1}^{N} \frac{(E_k - P_k)^2}{P_k} \tag{6} $$

P and E were the bin counts for the predicted and experimental, respectively,
and N was the number of bins. The critical value z was found in the Chi-Square
distribution table. If χ^2 was greater than z, then the H0 was rejected at the
0.05 significance level.
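The binning procedure can be sketched as below. The report does not state how a bin was expanded, so merging each under-populated bin with the next higher one is an assumption, as are the function name, the bin origin at zero, and folding any leftover high-level data into the last bin.

```python
def chi_square_binned(predicted, experimental, width=10.0, min_count=5):
    """Return (chi_square, number_of_bins) for two deposition samples.

    Bins start at `width` units wide and share the same edges for both
    data sets. A bin is widened (merged with the next one) until both
    data sets have at least `min_count` occurrences in it.
    """
    top = max(max(predicted), max(experimental))
    nbins = int(top // width) + 1
    p_counts = [0] * nbins
    e_counts = [0] * nbins
    for v in predicted:
        p_counts[int(v // width)] += 1
    for v in experimental:
        e_counts[int(v // width)] += 1

    merged = []                 # list of (predicted_count, experimental_count)
    p_acc = e_acc = 0
    for p, e in zip(p_counts, e_counts):
        p_acc += p
        e_acc += e
        if p_acc >= min_count and e_acc >= min_count:
            merged.append((p_acc, e_acc))
            p_acc = e_acc = 0
    if (p_acc or e_acc) and merged:
        # Fold the under-populated tail into the last complete bin.
        last_p, last_e = merged[-1]
        merged[-1] = (last_p + p_acc, last_e + e_acc)
    if len(merged) < 2:
        raise ValueError("fewer than two usable bins")

    chi_square = sum((e - p) ** 2 / p for p, e in merged)
    return chi_square, len(merged)
```

The returned statistic would then be compared with the tabulated Chi-Square critical value for the resulting number of bins. Because only counts per bin enter the sum, two patterns with the same deposition histogram but different locations give the same result.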


4.  Point-by-Point  Test 

The quantitative test developed specifically for this analysis was the
Point-by-Point; it is a modified version of the Chi-Square. The average at each
sample point was considered so that it would be an unbiased comparison between
the data sets, and the denominator was squared to remove the problem of
dimensionality. Consequently, this test evaluated area and geometry as well as
deposition.


5.  We wish to thank Richard Saucier of BRL for suggesting this metric.

A predicted/experimental comparison was performed at each grid position for a
user-specified deposition level for either data set. For this analysis, the
1 mg/m^2 level was chosen. The Point-by-Point metric is determined by the
following equation:

$$ \chi = \frac{1}{N} \sum_{i=1}^{N} \frac{(P_i - E_i)^2}{\left( \frac{P_i + E_i}{2} \right)^2} \tag{7} $$

P and E were the deposition levels at a particular grid position for the
predicted and experimental, respectively, and N was the number of data points
in the union of the two samples. The value of χ ranged from zero to four, with
zero indicating an identical comparison.
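A minimal sketch of the Point-by-Point calculation follows. It assumes the metric is the average, over all grid positions where either data set reaches the chosen level, of the squared difference divided by the squared mean of the two depositions; that form is consistent with the stated range of zero to four. The function name and the handling of values below the level are assumptions.

```python
def point_by_point(predicted, experimental, level=1.0):
    """Average of (P - E)^2 / ((P + E) / 2)^2 over the union of both samples.

    predicted and experimental are paired by grid position. A position
    is included when either value is at or above `level`.
    """
    terms = []
    for p, e in zip(predicted, experimental):
        if p < level and e < level:
            continue                       # outside both patterns
        mean = (p + e) / 2
        terms.append((p - e) ** 2 / mean ** 2)
    if not terms:
        raise ValueError("no grid position reaches the requested level")
    return sum(terms) / len(terms)
```

A position where only one data set has deposition contributes the maximum of 4, so patterns that do not overlap at all score 4 and identical patterns score 0. This is why the metric responds to area and shape as well as to deposition level.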


IV.  RESULTS

Each quantitative test was exercised using data from eleven field trials. In
order to evaluate the tests, area coverage and visual inspection were performed
at the 1, 10, 40, and 100 mg/m^2 deposition levels. The primary depositions of
concern for this analysis are the 10 and 40 mg/m^2 levels, with 40 mg/m^2
having precedence over 10 mg/m^2. Less consideration was given to the
100 mg/m^2 level because the test grid was too coarse to accurately portray
this contour. The 1 mg/m^2 level was used primarily to show the overall size
and shape of the pattern. A best-fit criterion was established based on the
evaluation of the test results.

A discussion of the data used to evaluate these tests will aid in
understanding the results that are to follow. The predicted data set typically
contains more data points than the experimental at the lower deposition levels,
6 mg/m^2 and less. The average predicted data set was more than twice the size
of the experimental at the 1 mg/m^2 level. Considering the tests and the way in
which they were applied, there was an implicit weighting at the lower
deposition levels because of this.

Subsets of the data were used to demonstrate the effect the lower deposition
levels had on the test results. The subsets also proved useful for evaluating
the data sets. The statistical tests were first applied to the entire data set,
1 mg/m^2 and greater, and then to subsets of the data. The subsets examined
were deposition levels of 10 mg/m^2 and greater, 40 mg/m^2 and greater, and
100 mg/m^2 and greater, in order to shift the emphasis from the lower levels to
the levels of most concern. In most cases, there was insufficient data at the
100 mg/m^2 level for the Sign, Mann-Whitney, and Chi-Square tests; hence, the
100 mg/m^2 level was not considered for these tests.
1.  Hypothesis Tests


a.  Sign Test.  With the exception of two cases, the Sign test rejected the H0
for all model predictions when the entire data set was under consideration.
Details of the actual deposition levels were lost as a result of considering
just the sign differences. Hence, the Sign test did not provide insight into
the predicted-to-experimental comparison.

For trial N4, the baseline z value, -1.22, fell inside the acceptance region,
-1.96 to +1.96, implying there was not sufficient evidence to reject the H0,
that is, the NUSSE3 model makes a good prediction of the DPG data. Figure A-2
of Appendix A shows the overall pattern for the baseline. (All contour plots
can be found in Appendix A. The crosshatched contour represents the
experimental data and the unfilled contour the predicted.) The area in which
predicted, but no experimental, data exists is comparable to the area where
there is experimental, but no predicted, data. Also, the number of + signs is
similar to the number of - signs where the patterns overlap. When these
conditions exist, the Sign test fails to reject H0. Little insight was gained
by considering only the sign of the difference between the predicted and
experimental data.

The above conditions did not exist for the N4 excursion; therefore the z value,
-2.73, was within the critical range and the H0 was rejected. Inspection of
Figures A-2 through A-7, representing the baseline and excursion contours at
the 1, 10, and 40 mg/m^2 levels, indicates that the excursion provided a better
comparison than the baseline; however, the results of the Sign test did not
support this conclusion.

Trial A1 also illustrates why the results of the Sign test were not
particularly useful for the best-fit criterion. For the baseline, the z value
was -6.00, which was within the critical range. The excursion z was also within
the critical range, but by a smaller margin, -4.03. Thus, the Sign test seems
to indicate a better comparison for the excursion even though it was still
rejected. Visual inspection of the baseline and excursion contours at 1, 10,
and 40 mg/m^2, Figures A-8 through A-13, indicates that the excursion was a
better fit. Since the H0 was rejected for both cases, test conclusions were
unclear for these baseline and excursion comparisons.

As shown in the analysis of Trial N4, the Sign test lost the detail of the
actual deposition values. In general, the results obtained from the Sign test
did not provide insight into which set of NUSSE3 parameters gave the "best"
representation of the field trial data. Hence, the Sign test was not considered
as part of the best-fit criterion.
b.  Mann-Whitney U Test.  The Mann-Whitney test rejected the H0 for all trials
when the entire data set was examined. Rejection by the Mann-Whitney test is
caused by a large difference between U_e and U_p. As the difference in the U
statistics increases, the difference between U and the mean also increases.
Upon examination of equation 5, as U (the smaller of U_e and U_p) decreases,
the absolute value of the z value increases, driving it further from the
acceptance region (-1.96 to +1.96). What causes U_e and U_p to grow apart is a
disproportionate change in the sample sizes (n_e and n_p) as compared to the
rank sums (S_e and S_p). The predicted data set typically contained more data
points at the lower deposition levels. It was this clustering of the data
points which caused the disproportionate change, resulting in rejection.
Comparison of the patterns at the higher levels of interest may be good, but
the large difference in the number of data points at the lower deposition
levels could cause rejection.

Trial N4 provided a good example of the problem with applying the Mann-Whitney
to the entire data set. The smallest difference between the baseline and
excursion z values for all trials occurred in N4. The Mann-Whitney z value only
went from -6.56 to -6.13 as the NUSSE3 parameters were varied; that is, the
excursion was only slightly closer to the acceptance region. However, when
viewing Figures A-2 through A-7, a drastic improvement in the shape of the
patterns is noted from the baseline to the excursion. The Mann-Whitney z values
did not reflect the magnitude of improvement in pattern shape because the
sample sizes and the rank sums varied proportionally to each other in the
baseline and excursion. Therefore, the difference between U_e and U_p for the
baseline was very close to the difference between U_e and U_p for the
excursion.

The excursion of N4 appeared to be a relatively good fit (Figures A-5 through
A-7), and yet the Mann-Whitney z at -6.13 was well outside the acceptance
region. This was attributed to the ratio of n_p to n_e being 1.05, whereas the
ratio of S_p to S_e was 0.68. Since these values were not proportionate, the
significant difference between U_p and U_e drove z into the critical range.

Due to the ranking of the data for the Mann-Whitney test and the nature of
these data sets, the Mann-Whitney test did not give results decisive enough for
the best-fit criterion. Therefore, the Mann-Whitney U test was not appropriate
as a best-fit criterion.

c.  Chi-Square Test.  The Chi-Square test, which evaluated the frequency of
occurrences at various deposition levels by binning the data sets, rejected the
H0 for all cases. A NUSSE3 predicted pattern typically has a larger area
coverage than the DPG pattern at the lower deposition levels. This can be
attributed to many factors, from insufficient data collection to
over-simplified simulation of complex phenomena that influence the shape and
area of a predicted chemical dissemination pattern. This discrepancy caused
rejection of H0 regardless of the goodness-of-fit at higher deposition levels.
The Chi-Square test was not able to evaluate shape for the chemical pattern;
this was attributed to binning the data. By binning the data, only deposition
levels are recorded and not location. Without the relative placement of each
datum, shape cannot be determined. However, the Chi-Square was able to
distinguish a change in area; again, this is due to binning the data. Since the
frequency of occurrence for each deposition level is determined, area can be
evaluated.


For trial N4, the χ^2 values and the area ratios (predicted to experimental)
did not change significantly from baseline to excursion. Listed are the χ^2
values and area ratios at 1 mg/m^2 for trial N4:

                 χ^2      Critical Value   Area Ratio
    Baseline    114.99         11.1           0.8
    Excursion   111.87         12.6           0.9

Contrary to these results, Figures A-2 and A-5 (baseline and excursion) show a
considerable change in the shape of the predicted pattern. Conversely, the χ^2
values and the area ratios changed significantly from baseline to excursion for
trial N6. Listed are these values at 1 mg/m^2 for trial N6:

                 χ^2      Critical Value   Area Ratio
    Baseline    288.96         12.6           3.6
    Excursion   137.96         11.1           2.4


However, Figures A-14 and A-15 show only a small change in shape from the
baseline to the excursion. Consequently, it appears that area was the only
factor which influenced the Chi-Square.

Thus,  it  was  concluded  that  for  this  analysis  the  Chi-Square  had  two 
shortcomings.  First,  more  data  points  existed  for  the  predicted  than  the  experi¬ 
mental  at  the  lower  deposition  levels.  Secondly,  due  to  the  binning  of  the  data, 
shape  was  not  an  influential  factor,  as  demonstrated  in  trials  N4  and  N6.  Based 
on  these  conclusions,  the  results  of  the  Chi-Square  test  were  not  considered  as  a 
best-fit  criterion. 

d.  Subset Results.  As previously stated, these hypothesis tests were applied
to subsets of the data, which shifted the emphasis from the lower deposition
levels to other levels of interest. When examining the above hypothesis tests,
34 out of 36 cases resulted in rejection of the H0 at 1 mg/m^2 and greater. At
10 mg/m^2, 29 out of 36 resulted in rejection, and at 40 mg/m^2 only 12
resulted in rejection. Thus, as the deposition levels of interest increased,
the number of rejections decreased, showing the effect the lower deposition
levels have on the test results. However, even though the number of rejections
decreases at these levels, these subset results do not give enough information
about the patterns to be considered sufficient as a best-fit criterion.


In conclusion, the Sign, Mann-Whitney, and Chi-Square tests were not used in
the best-fit criterion because they did not decisively address area, shape, or
deposition comparisons. However, each test does address certain characteristics
of the data set which can provide specific insight to the analyst. The Sign
test evaluates the median difference for all paired data, conveying whether one
data set is larger in overall size or deposition. Due to ranking the predicted
and experimental data together, the Mann-Whitney test can show if clustering of
data at specific deposition levels occurs for either data set. The Chi-Square
test gives insight into how the frequency distributions of the predicted and
experimental data compare. However, the test results by themselves can be
misleading. Therefore, to better understand these results, a closer look at the
actual data is required. Depending on the specific details needed for the
analysis objective, these hypothesis tests can be useful.

2.  Point-by-Point  Test 

The Point-by-Point test proved to be the most useful. This metric was sensitive
to changes in the predicted data because it considered area, shape, and
deposition. The relationship between the Point-by-Point χ values and the
deposition ratio is shown in Figure 1, where the deposition ratio is P/E or
E/P, with the smaller value as the numerator. For each individual trial, the
initial NUSSE3 run established a baseline value for the Point-by-Point.
Statistical values for subsequent excursion(s) were compared to this baseline
value to determine which set of NUSSE3 inputs produced a best-fit.
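The dependence on the deposition ratio can be written out for a single grid position. The derivation below is a sketch that assumes the per-point contribution has the form of a squared difference over a squared average; the symbol r for the deposition ratio is introduced here for illustration.

```latex
% Per-point contribution, with r = \min(P, E) / \max(P, E), 0 \le r \le 1
\frac{(P - E)^2}{\left( \frac{P + E}{2} \right)^2}
  = \frac{4 (P - E)^2}{(P + E)^2}
  = \frac{4 (1 - r)^2}{(1 + r)^2}
% r = 1   (equal depositions)        gives 0
% r = 0.5 (factor-of-two difference) gives 4 (0.25) / 2.25 \approx 0.44
% r = 0   (deposition in one set only) gives 4
```

Under this assumption the contribution falls monotonically from 4 to 0 as the ratio rises from 0 to 1, and it depends only on the ratio, not on the absolute deposition level.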

A  change  in  area  and  shape  for  the  predicted  data  was  reflected  in  the 
Point-by-Point  test.  For  trial  N8,  as  shown  in  Figures  A-34  through  A-39,  there 
was  a  large  difference  between  the  predicted  pattern  from  the  baseline  to  the 
excursion.  This  was  reflected  in  the  significant  decrease  in  the  Point-by-Point  test 
from  <voi  for  the  baseline  to  2.28  for  the  excursion. 

The Point-by-Point was sensitive even to subtle changes. Trial A3 is used as an
example. Listed are the Point-by-Point values for the A3 baseline and each
excursion at 1 mg/m^2 and greater:

    Baseline      3.01
    Excursion 1   2.94
    Excursion 2   2.79
    Excursion 3   2.71
    Excursion 4   2.81
    Excursion 5   2.74

Figures A-16 through A-33 show the baseline and excursion contours. In some
cases, the change in shape and area was not visually obvious from one excursion
to the next, but the Point-by-Point test was able to make the distinction.

Figure 1.  Point-by-Point vs. Deposition Ratio
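Since zero indicates an identical comparison, candidate runs can be ordered by their Point-by-Point value. The sketch below does this for the Trial A3 values listed above; the function name is hypothetical, and ordering by this value alone is a simplification, because the report also weighs area ratios and visual inspection and does not state which A3 excursion was finally chosen.

```python
# Point-by-Point values for Trial A3 at 1 mg/m^2 and greater, as listed above.
A3_VALUES = {
    "Baseline": 3.01,
    "Excursion 1": 2.94,
    "Excursion 2": 2.79,
    "Excursion 3": 2.71,
    "Excursion 4": 2.81,
    "Excursion 5": 2.74,
}


def rank_runs(values):
    """Return (name, value) pairs ordered from lowest to highest value."""
    return sorted(values.items(), key=lambda item: item[1])


ranking = rank_runs(A3_VALUES)
```

Here the first entry of `ranking` is Excursion 3 at 2.71, followed by Excursion 5 at 2.74, a difference small enough that the supporting area ratios and contour plots would matter.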

Even when the experimental data was highly discontinuous, as in trial N3, the
Point-by-Point still provided insight about the fit. When the entire data set
was under consideration, the χ values for N3 were 3.23 and 3.32 for the
baseline and an excursion, respectively. Figures A-40 through A-45 represent
the contours for the baseline and excursion at the 1, 10, and 40 mg/m^2 levels.
The χ values for the baseline and excursion at 1, 10, 40, and 100 mg/m^2 were
as follows:

    Level (mg/m^2)     1      10     40     100
    Baseline          3.23   2.45   2.22   2.30
    Excursion         3.32   2.31   2.08   2.13

Upon examination of the contour plots and area coverage ratios, it was
determined that the χ values accurately portrayed the baseline/excursion
comparison at each level. Area coverage ratios of predicted to experimental
data are as follows:

    Level (mg/m^2)     1      10     40     100
    Baseline          2.4    0.2    1.8    1.9
    Excursion         2.3    0.2    1.8    1.9

Overall the excursion was considered a better fit.

In conclusion, the Point-by-Point metric was the only test considered in
determining a best-fit because it included consideration of area, shape, and
deposition. Results obtained using the data from the field trials showed that
the test was sensitive to changes, substantial or otherwise, in the predicted
data. Even for trials where the experimental data was highly irregular, the
Point-by-Point metric proved to be useful by examining the results at each
deposition level. The Point-by-Point produced results that were consistent with
visual inspection for all eleven trials.


V.  SUPPORTING EVALUATION METHODS

As stated throughout this report, visual comparison of the patterns provided a
means for evaluating the quantitative tests. The graphical output of the
predicted and experimental contours at user-specified deposition levels aided
in verifying the results of the tests. It was especially useful in comparing
the shapes of the patterns. Visual inspection is a tool used to choose NUSSE3
input variations and to portray the information provided by the quantitative
tests.

Area coverage was also evaluated by examining the predicted/experimental area
ratio. In this analysis, the area ratios for NUSSE3 predictions to experimental
data were examined at four deposition levels (1, 10, 40, and 100 mg/m^2).
Area/deposition (A/D) plots graphically represent area coverage for both
predicted and experimental data (examples of A/D plots can be found in Appendix
B). Comparison of the predicted to the experimental area coverage at various
deposition levels can easily be examined through the area ratios and plots. For
example, Trial A3 area ratios are as follows:


               Deposition level (mg/m^2)
               1      10     40     100

Baseline       2.1    1.3    1.6    2.1
Excursion 1    2.0    1.3    1.8    2.7
Excursion 2    2.0    1.4    1.8    2.8
Excursion 3    2.0    1.3    1.8    2.6
Excursion 4    2.2    1.5    1.7    1.9
Excursion 5    2.2    1.3    1.6    2.2

When evaluating this information, less emphasis was placed on the
1 and 100 mg/m^2 deposition levels because of the limited accuracy with which
these levels can be measured. Most importantly, the deposition of concern for
this analysis was the 40 mg/m^2 level. (Predicting the area within a 100% error
was considered a good prediction.)
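
The screening described above can be sketched in a few lines of Python. The
ratios are those listed for Trial A3. The reading of "within a 100% error" as
|ratio - 1| <= 1 (predicted area no more than twice the experimental area), as
well as the names LEVELS, AREA_RATIOS, within_error, and screen, are
assumptions made for illustration and are not taken from the computer codes
used in this analysis.

```python
# Predicted/experimental area ratios for Trial A3, copied from the table above.
LEVELS = (1, 10, 40, 100)  # deposition levels, mg/m^2
AREA_RATIOS = {
    "Baseline":    (2.1, 1.3, 1.6, 2.1),
    "Excursion 1": (2.0, 1.3, 1.8, 2.7),
    "Excursion 2": (2.0, 1.4, 1.8, 2.8),
    "Excursion 3": (2.0, 1.3, 1.8, 2.6),
    "Excursion 4": (2.2, 1.5, 1.7, 1.9),
    "Excursion 5": (2.2, 1.3, 1.6, 2.2),
}


def within_error(ratio, max_error=1.0):
    """Assumed reading of 'within a 100% error': |ratio - 1| <= max_error."""
    return abs(ratio - 1.0) <= max_error


def screen(level=40):
    """Return, for each NUSSE3 run, whether its area ratio passes at `level`."""
    idx = LEVELS.index(level)
    return {name: within_error(ratios[idx]) for name, ratios in AREA_RATIOS.items()}


if __name__ == "__main__":
    for name, ok in screen(40).items():
        status = "within" if ok else "outside"
        print(f"{name}: {status} 100% error at 40 mg/m^2")
```

Under this assumed reading, every Trial A3 run passes at the 40 mg/m^2 level of
concern, so the area ratio alone does not separate the runs there; this is
consistent with its use as a supporting check rather than as the best-fit test.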

Even though the Point-by-Point test includes consideration of area coverage,
the area ratios give specific information to better evaluate the NUSSE3
prediction. Visual inspection of the contours was useful in evaluating shape.
Therefore, the area coverage ratios and visual inspection will be used in
support of the Point-by-Point test to achieve the best-fit.
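
To make the area coverage quantity concrete, the following Python sketch
computes the area at or above each deposition level from gridded deposition
values and forms the predicted/experimental ratio. The uniform grid, the cell
size, the sample values, and the function names are all hypothetical; this
report does not describe how the areas were actually computed.

```python
def area_at_or_above(grid, level, cell_area):
    """Area (m^2) of grid cells whose deposition (mg/m^2) is >= level."""
    count = sum(1 for row in grid for value in row if value >= level)
    return count * cell_area


def area_ratios(predicted, experimental, levels, cell_area):
    """Predicted/experimental area ratio per level (None if no experimental area)."""
    ratios = {}
    for level in levels:
        exp_area = area_at_or_above(experimental, level, cell_area)
        pred_area = area_at_or_above(predicted, level, cell_area)
        ratios[level] = pred_area / exp_area if exp_area > 0 else None
    return ratios


# Hypothetical 3 x 4 grids of deposition values (mg/m^2), 100 m^2 per cell.
experimental = [
    [0, 2, 12, 3],
    [1, 45, 110, 15],
    [0, 5, 20, 2],
]
predicted = [
    [2, 11, 42, 12],
    [5, 60, 120, 41],
    [1, 15, 30, 4],
]
levels = (1, 10, 40, 100)
ratios = area_ratios(predicted, experimental, levels, cell_area=100.0)

if __name__ == "__main__":
    for level in levels:
        print(f"{level:>4} mg/m^2: ratio = {ratios[level]}")
```

Plotting the two area values against the deposition level for each data set
would give a curve of the kind shown in the A/D plots.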


VI.  SUMMARY

Analyzing multiple chemical munitions presented problems that available
single munition techniques could not address. To resolve this, the methods were
re-examined and new techniques developed that could be applied to both single
and multiple munition analyses. Several quantitative tests were considered and
tested using field trial data. This methodology is embodied in a computer code
called MULTI.

Comparing area, shape, and deposition of the chemical patterns was the
primary consideration in deciding which test would be included in the best-fit
criterion. A best-fit criterion was established based on the results obtained
from the tests. Only one of the tests addressed all three points: the
Point-by-Point. Therefore, the Point-by-Point is the primary metric used to
determine a best-fit.

6.  R.  Saucier,  USABRL,  private  communication.

Visual inspection of the contour plots aids in deciding which NUSSE3
inputs to vary and in achieving a best-fit. The ratios of the area are also
examined to ensure these values are within a reasonable (100%) error, primarily
at the deposition of interest. The Sign, Mann-Whitney, and Chi-Square tests
were not appropriate as a best-fit criterion but could provide insight into
certain characteristics of the data sets.

For this analysis, consistent results were obtained by use of this criterion.
After examining all the tests and the data used in the evaluation, it is
concluded that evaluating the quantitative test, visual inspection, and area
coverage simultaneously is essential. Used in conjunction, these tests provide
the information necessary to perform a predicted-to-experimental data analysis
of chemical munitions.
APPENDIX  A


Contour  Plots 


Each  plot  depicts  the  experimental  and  predicted  data  at  a  specific  deposition 
level  for  each  trial.  The  crosshatched  area  represents  the  experimental  data  and 
the  unfilled  contour  the  predicted.  Listed  are  the  Figures: 


Figure  A-1              Dugway Test Grid
Figures A-2  - A-4       Baseline Trial N4
Figures A-5  - A-7       Excursion Trial N4
Figures A-8  - A-10      Baseline Trial A1
Figures A-11 - A-13      Excursion Trial A1
Figure  A-14             Baseline Trial N6
Figure  A-15             Excursion Trial N6
Figures A-16 - A-18      Baseline Trial A3
Figures A-19 - A-21      Excursion 1 Trial A3
Figures A-22 - A-24      Excursion 2 Trial A3
Figures A-25 - A-27      Excursion 3 Trial A3
Figures A-28 - A-30      Excursion 4 Trial A3
Figures A-31 - A-33      Excursion 5 Trial A3
Figures A-34 - A-36      Baseline Trial N8
Figures A-37 - A-39      Excursion Trial N8
Figures A-40 - A-42      Baseline Trial N3
Figures A-43 - A-45      Excursion Trial N3


Figure A-1   Dugway Test Grid
Trial N4, 1 mg/m^2
Figure A-4   Baseline Trial N4, 40 mg/m^2
Figure A-6   Excursion Trial N4, 10 mg/m^2
Figure A-8   Baseline Trial A1, 1 mg/m^2
Figure A-12  Excursion Trial A1, 10 mg/m^2
Figure A-16  Baseline Trial A3, 1 mg/m^2
Figure A-21  Excursion 1 Trial A3, 40 mg/m^2
Figure A-25  Excursion 3 Trial A3, 1 mg/m^2
Figure A-29  Excursion 4 Trial A3, 10 mg/m^2
Figure A-36  Baseline Trial N8, 40 mg/m^2
Figure A-40  Baseline Trial N3, 1 mg/m^2
Figure A-44  Excursion Trial N3, 10 mg/m^2
Figure A-45  Excursion Trial N3, 40 mg/m^2
APPENDIX  B


A/D  Plots 

Each plot depicts the experimental and predicted data for the baseline and
excursion of each trial.


Figures B-1  - B-2
Figures B-3  - B-4       Trial A1
Figures B-5  - B-6
Figures B-7  - B-12      Trial A3
Figures B-13 - B-14      Trial N8
Figures B-15 - B-16      Trial N3